ABSTRACT

We present a fast and efficient hybrid algorithm for selecting exoplanetary candidates
from wide-field transit surveys. Our method is based on the widely-used SysRem and
Box Least-Squares (BLS) algorithms. Patterns of systematic error that are common to
all stars on the frame are mapped and eliminated using the SysRem algorithm. The
remaining systematic errors caused by spatially localised flat-fielding and other errors
are quantified using a boxcar-smoothing method. We show that the dimensions of the
search-parameter space can be reduced greatly by carrying out an initial BLS search
on a coarse grid of reduced dimensions, followed by Newton-Raphson refinement of the
transit parameters in the vicinity of the most significant solutions. We illustrate the
method's operation by applying it to data from one field of the SuperWASP survey
comprising 2300 observations of 7840 stars brighter than V = 13.0. We identify 11
likely transit candidates. We reject stars that exhibit significant ellipsoidal variations 
indicative of a stellar-mass companion. We use colours and proper motions from the 
2MASS and USNO-B1.0 surveys to estimate the stellar parameters and the companion 
radius. We find that two stars showing unambiguous transit signals pass all these tests, 
and so qualify for detailed high-resolution spectroscopic follow-up. 

Key words: methods: data analysis - techniques: photometric - stars: planetary 
systems 



1 INTRODUCTION 

Among the 194 extra-solar planets discovered in the last 
decade, the "hot Jupiters" currently present the greatest 
challenges to understanding and the greatest observational 
rewards. Efforts to explain their origin have transformed the- 
ories of planetary-system formation. Do these planets form 
via gravitational instability in cold discs, or must they form 
by core accretion beyond the ice boundary, where ice mantles
on dust grains allow rapid agglomeration of a massive core 
of heavy elements? How rapidly do they migrate inwards 
through a massive protoplanetary disc, and what mecha- 
nism halts the migration? 

The subset of these planets that transit their parent
stars is of key importance in addressing these questions,
because they are the only planets whose radii and
masses can be determined directly. The first such discovery
(Charbonneau et al. 2000) confirmed the gas-giant nature of
the hot Jupiters; indeed the inflated radius of HD 209458b
continues to challenge our understanding of exoplanetary
interior structure. Subsequent discoveries have presented
further surprises. The apparent correlation between planet
mass and minimum survivable orbital separation among the
known transiting planets offers important clues to the
"stopping mechanism" for inward orbital migration of
newly-formed giant planets in protoplanetary discs (Mazeh et al.
2005). The high core mass of the recently-discovered
Saturn-mass planet orbiting HD 149026 (Sato et al. 2005) presents
difficulties for models of planet formation via gravitational
instability. Spitzer observations of the secondary eclipses
of TrES-1b, HD 209458b and HD 189733b have provided
the first estimates of the temperatures in these planets'
cloud decks (Charbonneau et al. 2005; Deming et al. 2005;
Deming et al. 2006).

To date, however, only ten such planets have been discovered.
Three of the ten have been found via targeted
radial-velocity searches, the other seven through large-scale
photometric surveys for transit events. The primary goal of
the SuperWASP project (Pollacco et al. 2006) is to discover 
bright, new transiting exoplanets in sufficiently large num- 
bers that we can place studies of their mass-radius relation, 
and the suspected relationship between minimum orbital 
distance and planet mass, on a secure statistical footing. 
SuperWASP's "shallow-but-wide" approach to transit hunt- 
ing is designed to find planets that are not only sufficiently 
bright (10 < V < 13.0) for high-precision radial-velocity
follow-up studies to be feasible on telescopes of modest aper- 
ture, but also for detailed follow-up studies such as trans- 
mission spectroscopy during transits, and Spitzer secondary- 
eclipse observations. Only five of the ten presently-known 
transiting planets orbit stars brighter than V = 14 and so 
have such strong follow-up potential. 

Here we present a methodology used in the search for 
exoplanetary transit candidates in data from the first year of 
SuperWASP operation. We employ the SysRem algorithm
of Tamuz et al. (2005) to identify and remove patterns of
correlated systematic error from the stellar light-curves. We
present a refined version of the Box Least-Squares (BLS)
algorithm of Kovacs et al. (2002), which permits a fast grid
search and efficient refinement of the most promising so- 
lutions without binning the data. Using simulated transits 
injected into real SuperWASP data we develop a filtering 
strategy to optimise and quantify the recovery rate and 
false-alarm probability as functions of stellar magnitude and 
transit depth. We develop simple plausibility tests for tran- 
sit candidates, using the transit duration and depth to es- 
timate the mass of the parent star and the radius of the 
planet. We mine publicly-available catalogues to obtain the
V - K colours, proper motions and other properties of the
host stars. Combined with the transit durations and depths, 
these give improved physical parameters of each system and 
help us to identify the most promising candidates for spec- 
troscopic follow-up observations. 



2 INSTRUMENTATION AND OBSERVATIONS 

The SuperWASP camera array, located at the Observatorio 
del Roque de los Muchachos on La Palma, Canary Islands, 
consists of five 200mm f/1.8 Canon lenses each with an Andor
CCD array of 2048 × 2048 13.5 μm pixels, giving a field of view
7.8 degrees square for each camera. By cycling through a sequence
of 7 or 8 fields located within 4 hours of the meridian,
at field centres separated by one hour right ascension, Su- 
perWASP monitored up to 8% of the entire sky for between 
4 and 8 hours each night during the 2004 observing sea- 
son. The average interval between visits to each field was 6 
minutes. Each exposure was of 30s duration, and was taken 
without filters. 

Between 2004 May and September, the five SuperWASP 
cameras secured light-curves of some 2 x 10 stars brighter 
than V = 13. The long-term precision of the data (deter- 
mined from the RMS scatter of individual data points for 
non-variable stars) is 0.004 mag at V = 9.5, degrading to 
0.01 mag at V = 12.3. Our sampling rate and run duration 
guarantee that 4 or more transits should have been observed 
in 90% of all systems with periods less than 5 days, and 100% 
of all systems with periods less than 4 days, though some 
incompleteness is expected at periods very close to integer 
multiples of 1 day. 



3 DATA REDUCTION 

The data were reduced using the SuperWASP pipeline. The 
pipeline carries out an initial statistical analysis of the raw 
images, classifying them as bias frames, flat fields, dark 
frames or object frames. The bias frames are combined using 
optimally weighted averaging with outlier rejection. Auto- 
mated sequences of flat fields secured at dawn and dusk are 
corrected for sky illumination gradients and combined using 
an optimal algorithm that maps and corrects for the pattern 
introduced by the finite opening and closing time of the iris 
shutter on each camera. 

Science frames are bias-subtracted, corrected for shut- 
ter travel time and corrected for pixel-to-pixel sensitivity 
variations and vignetting using the flat-field exposures. Flat 
fields from different nights are combined using an algorithm
in which the weights of older flat-field frames decay on a
timescale of 14 days. A catalogue of objects on each frame is
constructed using EXTRACTOR, the Starlink implementation
of the SExtractor (Bertin & Arnouts 1996) source detection
software. An automated field recognition algorithm
identifies the objects on the frame with their counterparts
in the TYCHO-2 catalogue (Høg et al. 2000) and establishes
an astrometric solution with an RMS precision of 0.1 to 0.2
pixel.

Aperture photometry is then carried out at the positions
on each CCD image of all objects in the USNO-B1.0 catalogue
(Monet et al. 2003) with second-epoch red magnitudes
brighter than 15.0. Fluxes are measured in three apertures
with radii of 2.5, 3.5 and 4.5 pixels. The ratios between the 
fluxes in different pairs of apertures yield a "blending in- 
dex" which quantifies image morphology, and is used to flag 
blended stellar images and extended, non-stellar objects. In- 
dividual objects are allocated SuperWASP identifiers of the 
form "1SWASP Jhhmmss.ss + ddmmss.s" , which are based 
on their USNO-B1.0 coordinates for equinox J2000.0 and 
epoch J2000.0. 

The resulting fluxes are corrected for primary and sec- 
ondary extinction, and the zero-point for each frame is tied 
to a network of local secondary standards in each field, 
Figure 1. Probability of observing more than $N_t$ transit events
in the 2004 data from the field centred at RA = 01h 43m, Dec
= +31° 26', as a function of orbital period. For periods less than
5 days, the probability of observing 3 or more transits is at least
50 percent, except at periods close to integer and half-integer
numbers of days.



whose magnitudes are derived from WASP fluxes trans- 
formed via a colour equation relating instrumental magni- 
tudes to TYCHO-2 V magnitudes. The resulting fluxes are 
stored in the SuperWASP Data Archive at the University 
of Leicester. The corresponding transformed magnitudes for 
all objects in each field are referred to as "WASP V" mag- 
nitudes throughout this paper. 



4 SYSTEMATIC ERROR REMOVAL 

The reduced data from the pipeline inevitably contain low- 
level systematic errors. In this section we describe a coarse 
initial decorrelation and application of the SysRem algorithm
of Tamuz et al. (2005). Before modelling and removing
patterns of correlated error we perform a coarse initial
decorrelation by referencing each star's magnitude to its own
mean, removing small night-to-night and frame-to-frame differences
in the zero-point, and measuring the additional,
independent variance components introduced in individual 
stars by their intrinsic variability and in some observations 
by patchy cloud. 



4.1 Coarse decorrelation 

We start with a two-dimensional array $m_{ij}$ of processed stellar
magnitudes from the pipeline. The first index i denotes a
single CCD frame within the entire season's data. The second
index j labels an individual star. We compute the mean
magnitude of each star:

$$\hat{m}_j = \frac{\sum_i m_{ij} w_{ij}}{\sum_i w_{ij}}, \qquad (1)$$

where the weights $w_{ij}$ incorporate both the formal variance
$\sigma^2_{ij}$ calculated by the pipeline from the stellar and
sky-background fluxes, and an additional systematic variance
component $\sigma^2_{t(i)}$ introduced in individual frames by passing
wisps of cloud, Sahara dust events and other transient phenomena
which degrade the extinction correction:



3.1 Sample selection and survey completeness 

The field chosen for development and testing of the transit 
search algorithm is centred at $\alpha_{2000}$ = 01h 43m, $\delta_{2000}$ =
+31° 26'. A search of the WASP archive, centred on this
position and covering the full 7.8°-square field of view of 
the camera, yielded light-curves of 7840 stars brighter than 
WASP V = 13 for which a catalogue query indicated that 
the light-curves comprised more than 500 valid photometric 
data points. Indeed most of the objects in this field had more 
than 2000 valid photometric measurements. The maximum
number of valid observations in any light-curve was
2301. The resulting set of light-curves was loaded into a 
rectangular matrix of 7840 light-curves by 2301 observations 
for further processing. 

The expected number of transits present in the observed 
light-curve of any given star depends on the sampling pat- 
tern of the observations and the period and phase of the 
transit cycle. In Fig. 1 we plot the probability of $N_t$ or more
transits being present in the data, as a function of orbital period.
We consider a transit as having been "observed" if data
have been obtained within the phase ranges $\phi < 0.1w/P$ or
$\phi > 1 - 0.1w/P$, where w is the expected transit duration
as described at the start of Section 5 and P is the orbital
period. The regular sampling pattern on most nights of acceptable
quality generally ensures that at least 40 percent
of a given transit event must be well-observed if it is to
be counted according to this criterion. For the field studied
here, the probability of observing at least three transits is 70%
or better at periods less than 3.5 days.
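
The counting criterion above can be illustrated with a minimal sketch. The function name and argument layout are hypothetical, and the sketch assumes that the epoch is a time of mid-transit (so that phase zero is mid-transit) and that all times share one unit, such as days.

```python
import math

def count_observed_transits(times, period, epoch, width):
    """Count transit cycles sampled at phase phi < 0.1*w/P or phi > 1 - 0.1*w/P.

    times, period, epoch and width share the same unit (e.g. days);
    epoch is assumed to be a time of mid-transit.
    """
    limit = 0.1 * width / period
    observed_cycles = set()
    for t in times:
        x = (t - epoch) / period
        phi = x - math.floor(x)            # orbital phase in [0, 1)
        if phi < limit or phi > 1.0 - limit:
            observed_cycles.add(round(x))  # index of the nearest transit
    return len(observed_cycles)
```

Evaluating this count over a grid of trial periods and epochs for the real observation times would give a probability curve of the kind plotted in Fig. 1.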



$$w_{ij} = \frac{1}{\sigma^2_{ij} + \sigma^2_{t(i)}}. \qquad (2)$$



Data points from frames of dubious quality are thus down-weighted.
The weight is set to zero for any data point flagged
by the pipeline as either missing or bad.

The zero-point correction for each frame i follows:

$$z_i = \frac{\sum_j (m_{ij} - \hat{m}_j) u_{ij}}{\sum_j u_{ij}}. \qquad (3)$$

In this case the weights are defined as

$$u_{ij} = \frac{1}{\sigma^2_{ij} + \sigma^2_{s(j)}}, \qquad (4)$$

where $\sigma^2_{s(j)}$ is an additional variance caused by intrinsic stellar
variability. This down-weights variable stars in the calculation
of the zero-point offset for each frame.

Initially we set $\sigma^2_{t(i)} = \sigma^2_{s(j)} = 0$, and compute the average
magnitude $\hat{m}_j$ for every star j and the zero-point offset
$z_i$ for every frame.
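
The first pass of the coarse decorrelation can be sketched as follows. This is an illustration only, not the pipeline code: the function names and the list-of-lists layout (`m[i][j]` for frame i and star j) are assumptions, and bad points are assumed to be represented by an infinite variance, which yields zero weight.

```python
def star_means(m, var, var_t):
    """Eq. (1): weighted mean magnitude of each star j over frames i."""
    means = []
    for j in range(len(m[0])):
        num = den = 0.0
        for i in range(len(m)):
            w = 1.0 / (var[i][j] + var_t[i])   # Eq. (2); var = inf gives w = 0
            num += m[i][j] * w
            den += w
        means.append(num / den)
    return means

def frame_zero_points(m, var, var_s, means):
    """Eq. (3): weighted mean offset of each frame i from the star means."""
    zero_points = []
    for i in range(len(m)):
        num = den = 0.0
        for j in range(len(m[0])):
            u = 1.0 / (var[i][j] + var_s[j])   # Eq. (4)
            num += (m[i][j] - means[j]) * u
            den += u
        zero_points.append(num / den)
    return zero_points

# Two frames, two stars; frame 0 is 0.1 mag fainter than frame 1 on average.
m = [[10.1, 12.1], [9.9, 11.9]]
var = [[1e-4, 1e-4], [1e-4, 1e-4]]
means = star_means(m, var, var_t=[0.0, 0.0])
zero_points = frame_zero_points(m, var, var_s=[0.0, 0.0], means=means)
```

In this toy case the star means come out at 10.0 and 12.0 and the frame offsets at +0.1 and -0.1 magnitude, to within rounding.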

To determine the additional variance $\sigma^2_{s(j)}$ for the
intrinsic variability of a given star j from the data themselves,
we use a maximum-likelihood approach. We define a
data vector $X = \{m_{ij}, i = 1...n\}$ containing the light-curve
of star j, and a model $\mu = \{\hat{m}_j + z_i, i = 1...n\}$. If both
sets of errors are gaussian, then the probability of the ith
individual observation is

$$P(X_i|\mu_i) = \frac{1}{\sqrt{2\pi\,(\sigma^2_{ij} + \sigma^2_{t(i)} + \sigma^2_{s(j)})}} \exp\left[-\frac{(m_{ij} - \hat{m}_j - z_i)^2}{2(\sigma^2_{ij} + \sigma^2_{t(i)} + \sigma^2_{s(j)})}\right].$$

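The likelihood of the entire data vector for star j given
the model is

$$L(\mu) = (2\pi)^{-n/2} \left[\prod_{i=1}^{n} \frac{1}{\sqrt{\sigma^2_{ij} + \sigma^2_{t(i)} + \sigma^2_{s(j)}}}\right] \exp\left(-\frac{\chi^2}{2}\right),$$

where

$$\chi^2 = \sum_{i=1}^{n} \frac{(m_{ij} - \hat{m}_j - z_i)^2}{\sigma^2_{ij} + \sigma^2_{t(i)} + \sigma^2_{s(j)}}.$$

Taking logs and differentiating, we find that the
maximum-likelihood value of $\sigma_{s(j)}$ satisfies

$$\frac{\partial \ln L}{\partial \sigma_{s(j)}} = \sigma_{s(j)} \left[\sum_i \frac{(m_{ij} - \hat{m}_j - z_i)^2}{(\sigma^2_{ij} + \sigma^2_{t(i)} + \sigma^2_{s(j)})^2} - \sum_i \frac{1}{\sigma^2_{ij} + \sigma^2_{t(i)} + \sigma^2_{s(j)}}\right] = 0.$$

We solve the equation

$$\sum_i \frac{(m_{ij} - \hat{m}_j - z_i)^2}{(\sigma^2_{ij} + \sigma^2_{t(i)} + \sigma^2_{s(j)})^2} - \sum_i \frac{1}{\sigma^2_{ij} + \sigma^2_{t(i)} + \sigma^2_{s(j)}} = 0$$

iteratively for each $\sigma^2_{s(j)}$, holding the $\sigma_{t(i)}$ fixed.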
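
One simple way to solve this equation numerically is bisection. The text does not say which iterative scheme was used, so the sketch below is a stand-in and the function name is hypothetical. Here `residuals` holds the values $m_{ij} - \hat{m}_j - z_i$ for one star and `variances` the corresponding sums $\sigma^2_{ij} + \sigma^2_{t(i)}$, all assumed strictly positive.

```python
def excess_variance(residuals, variances, n_iter=100):
    """Solve sum r^2/(v+s)^2 - sum 1/(v+s) = 0 for s >= 0 by bisection.

    Returns 0.0 when the scatter is already explained by the known
    variances, i.e. when the left-hand side is not positive at s = 0.
    """
    def f(s):
        return sum(r * r / (v + s) ** 2 - 1.0 / (v + s)
                   for r, v in zip(residuals, variances))

    if f(0.0) <= 0.0:
        return 0.0
    # At s = max(r^2) every term satisfies r^2/(v+s)^2 < 1/(v+s), so f <= 0.
    lo, hi = 0.0, max(r * r for r in residuals)
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For four residuals of ±1 with known variances of 0.5, the solution is an additional variance of 0.5, which brings the total variance into agreement with the observed scatter.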

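We then perform an analogous calculation, summing
over all the stars in the ith frame and solving

$$\sum_j \frac{(m_{ij} - \hat{m}_j - z_i)^2}{(\sigma^2_{ij} + \sigma^2_{t(i)} + \sigma^2_{s(j)})^2} - \sum_j \frac{1}{\sigma^2_{ij} + \sigma^2_{t(i)} + \sigma^2_{s(j)}} = 0$$

for each $\sigma^2_{t(i)}$, holding the $\sigma^2_{s(j)}$ fixed.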

At this stage we refine the mean magnitude per star,
the zero-point offsets and the additional variances for stellar
variability and patchy cloud, by iterating Eqs. (1) and (3) and
the two maximum-likelihood variance equations to convergence.

The coarsely decorrelated differential magnitude of each
star is then given by

$$x_{ij} = m_{ij} - \hat{m}_j - z_i.$$



4.2 Further decorrelation with SysRem 

Some of the many sources of systematic error that affect 
ultra-wide-field photometry with commercial camera lenses 
are readily understood and easily corrected, while others 
are less easy to quantify. For example, the SuperWASP 
bandpass spans the visible spectrum, introducing significant 
colour-dependent terms into the extinction correction. The 
pipeline attempts to calibrate and remove secondary extinc- 
tion using TYCHO-2 B — V colours for the brighter stars, 
but uncertainties in the colours of the TYCHO-2 stars and 
the lack of colour information for the fainter stars means 
that some systematic errors remain. Bright moonlight or re- 
duced transparency arising from stratospheric Sahara dust 
events reduce the contrast between faint stars and the sky 
background, altering the rejection threshold for faint sources 
in the sky-background annulus and biassing the photometry 
for faint stars. The SuperWASP camera lenses are vignetted 
across the entire field of view, and the camera array is not 
autoguided, so polar-axis misalignment causes stellar images 
to drift by a few tens of pixels across the CCD each night. It
is possible that temperature changes during the night could
Figure 2. Upper panel: RMS scatter versus magnitude prior to
decorrelation with SysRem. The upper curve shows the RMS
scatter of individual data points in the light-curves of the
non-variable stars in the ensemble. The middle curve shows the scatter
in the same light-curves after performing a moving weighted average
over all complete 2.5-hour intervals within each night. The
lower curve shows the RMS scatter of the individual data points
divided by the square root of the average number of points
(typically 22) in a 2.5-hour interval. The correlated noise amplitude
among the brightest stars is typically 0.0025 mag. Lower panel:
Covariance spectral index b as a function of V magnitude prior
to decorrelation with SysRem. Pure uncorrelated (white) noise
should give b = -0.5, while pure correlated noise should give
b = 0. We see that the effects of correlated noise are most
pronounced for the brightest stars. Even the faintest stars are affected
to some extent.



affect the camera focus, changing the shape of the point- 
spread function across the field and biassing the photometry 
for fainter stars. 

These systematic errors, and no doubt others as-yet
unidentified, have a serious impact on the detection threshold
for transits. Pont (2006) discussed methods of characterising
the structure of the covariance matrix for a given
stellar light-curve. The first and simplest method is to carry
out boxcar smoothing of each night's data, with a smoothing
length comparable to the typical 2.5-hour duration of a planetary
transit. For every set of L points spanning a complete
2.5-hour interval starting at the kth observation, we
Figure 3. Upper panel: RMS scatter versus magnitude after removal
of the four strongest correlated error components using
SysRem. The curves are defined as in Fig. 2. The correlated noise
amplitude of the binned data is reduced to 0.0015 magnitudes
for the brightest stars. Lower panel: Covariance spectral index b
as a function of V magnitude after removal of 4 correlated error
components using SysRem. While some effects of correlated noise
remain for the brightest stars, stars fainter than V = 11.0 have
covariance spectral indices close to the value b = -0.5 expected
for white noise.

construct an optimally-weighted average magnitude

$$\hat{m}_k = \frac{\sum_{i=k}^{k+L-1} m_i w_i}{\sum_{i=k}^{k+L-1} w_i},$$

with bad observations down-weighted as above using
$w_i = 1/(\sigma_i^2 + \sigma^2_{t(i)})$.

The RMS scatter $\sigma_{\rm binned}$ in the smoothed light-curve
of $\hat{m}_k$ values is then compared to the RMS scatter
$\sigma_{\rm unbinned}$ of the individual data points. For uncorrelated
noise, we expect $\sigma_{\rm binned} = \sigma_{\rm unbinned}/\sqrt{L}$, where L is the
average number of observations made in a 2.5-hour interval.
In Fig. 2 we plot $\sigma_{\rm unbinned}$, $\sigma_{\rm binned}$ and $\sigma_{\rm unbinned}/\sqrt{L}$
as functions of V magnitude. For clarity, we exclude all obviously
variable stars having $\sqrt{\sigma^2_{s(j)}} > 0.005$ magnitude (with
$\sigma^2_{s(j)}$ as defined in Section 4.1). The RMS scatter in the binned data is
typically 0.0025 magnitude for the brightest non-variable stars,
far worse than the 0.0008 magnitude that would be achieved
if the noise were uncorrelated.
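
The white-noise expectation quoted here follows from $0.004/\sqrt{22} \approx 0.00085$ magnitude. The sketch below checks that scaling on simulated uncorrelated noise. It is a simplified illustration with hypothetical function names: the means are unweighted and night boundaries are ignored, unlike the weighted per-night smoothing described in the text.

```python
import math
import random

def rms(values):
    """Root-mean-square scatter about the mean."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))

def boxcar_means(values, length):
    """Unweighted means of every run of `length` consecutive points."""
    return [sum(values[k:k + length]) / length
            for k in range(len(values) - length + 1)]

# Simulated white noise with the 0.004 mag scatter quoted for V = 9.5.
rng = random.Random(1)
white = [rng.gauss(0.0, 0.004) for _ in range(20000)]
sigma_unbinned = rms(white)
sigma_binned = rms(boxcar_means(white, 22))
expected = sigma_unbinned / math.sqrt(22)
```

For uncorrelated noise `sigma_binned` should lie close to `expected`; a measured binned scatter well above that value, as found for the brightest stars, indicates correlated noise.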

The covariance structure of the correlated noise is quantified
by the power-law dependence of the RMS scatter on
the number of observations used in the boxcar smoothing:

$$\sigma_{\rm binned} = \sigma_{\rm unbinned} L^b.$$

For completely uncorrelated noise we expect b = -1/2, while
for completely correlated noise (e.g. from intrinsic
large-amplitude stellar variability on timescales longer than the
longest smoothing length considered but shorter than the
data duration) we expect the RMS scatter to be independent
of the number of data points, giving b = 0. We measure
b for each star using the incomplete smoothing intervals at
the start and end of the night. We create a set of binned
magnitudes obtained for L = 1, 2, 3, ... consecutive observations.
The RMS scatter in the binned magnitudes is then plotted
as a function of L and a power law
fitted to determine b. In Fig. 2 we plot b as a function of V
magnitude, again excluding intrinsic variable stars having
$\sqrt{\sigma^2_{s(j)}} > 0.005$ magnitude. As expected, we find the effects
of correlated noise to be most pronounced for the brightest
non-variable stars. Even at our faint cutoff limit of V = 13,
however, we do not fully recover the uncorrelated noise value
b = -0.5.
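
A power-law fit of this kind reduces to a straight-line fit in log-log space. The following sketch uses an unweighted least-squares slope, which is an assumption, since the text does not specify how the fit was weighted. The function name is hypothetical.

```python
import math

def covariance_index(lengths, scatters):
    """Least-squares slope b of log(scatter) against log(length)."""
    xs = [math.log(n) for n in lengths]
    ys = [math.log(s) for s in scatters]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx

lengths = [1, 2, 4, 8, 16]
b_white = covariance_index(lengths, [0.004 * n ** -0.5 for n in lengths])
b_red = covariance_index(lengths, [0.004 for _ in lengths])
```

Scatter that falls as $L^{-1/2}$ gives `b_white` = -0.5, and scatter that does not fall at all gives `b_red` = 0, matching the two limiting cases described above.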

We use the SysRem algorithm of Tamuz et al. (2005) to
identify and remove patterns of correlated noise in the data.
The reader is referred to that paper for details of the
implementation. The SysRem algorithm produces a corrected
magnitude $\tilde{x}_{ij}$ for star j at time i, given by

$$\tilde{x}_{ij} = x_{ij} - \sum_{k=1}^{M} {}^{(k)}c_j \, {}^{(k)}a_i,$$

where M represents the number of basis functions (each representing
a distinct pattern of systematic error) removed.
An interesting property of the SysRem algorithm is that 
the inverse variance weighted mean value of each basis func- 
tion, multiplied by the corresponding stellar coefficient, is 
so close to zero for all but the most highly variable stars 
that it is not necessary to repeat the coarse decorrelation. 
The inverse variance weighted mean change in the zero-point 
of each frame is generally less than 0.001 magnitude, again 
rendering further coarse decorrelation unnecessary after the 
final application of SysRem. 
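
For orientation, the core of a single SysRem pass can be sketched as an alternating weighted least-squares fit of one term $c_j a_i$ to the residuals. This is a minimal illustration, not the implementation used here: the function names, the fixed iteration count and the starting guess $a_i = 1$ are assumptions, and a production version would need guards against zero denominators.

```python
def sysrem_component(resid, var, n_iter=20):
    """Fit one rank-1 systematic term c[j]*a[i] to resid[i][j].

    Alternates the two inverse-variance weighted least-squares updates;
    resid[i][j] and var[i][j] refer to frame i and star j.
    """
    n_frames, n_stars = len(resid), len(resid[0])
    a = [1.0] * n_frames          # arbitrary starting guess
    c = [0.0] * n_stars
    for _ in range(n_iter):
        for j in range(n_stars):
            num = sum(resid[i][j] * a[i] / var[i][j] for i in range(n_frames))
            den = sum(a[i] ** 2 / var[i][j] for i in range(n_frames))
            c[j] = num / den
        for i in range(n_frames):
            num = sum(resid[i][j] * c[j] / var[i][j] for j in range(n_stars))
            den = sum(c[j] ** 2 / var[i][j] for j in range(n_stars))
            a[i] = num / den
    return a, c

def remove_component(resid, a, c):
    """Subtract the fitted term from every data point."""
    return [[resid[i][j] - c[j] * a[i] for j in range(len(c))]
            for i in range(len(a))]

# Toy data that is exactly one systematic pattern: the residual vanishes.
a_true, c_true = [1.0, 2.0, 3.0], [0.5, -1.0]
resid = [[cj * ai for cj in c_true] for ai in a_true]
var = [[1.0, 1.0] for _ in a_true]
a_fit, c_fit = sysrem_component(resid, var)
cleaned = remove_component(resid, a_fit, c_fit)
```

Removing M basis functions amounts to repeating the fit on the cleaned residuals M times. Only the product $c_j a_i$ is determined by the fit; the individual scalings of `a_fit` and `c_fit` are arbitrary.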

As described by Tamuz et al. (2005), we find several distinct
basis functions $a_i$ representing patterns of correlated systematic
error in the data which affect every star in the field
to an extent quantified by the coefficients $c_j$. In Fig. 4 we plot
the first four basis functions against hour angle, after fold- 
ing the entire season's basis-function values on a period of 1 
sidereal day. The first and strongest basis function produced 
by SysRem shows a generally smooth night-to-night varia- 
tion with a characteristic timescale of order 10 days. 
Superimposed on this large-amplitude, long-timescale vari- 
ation is a small-amplitude, linear trend through each night. 
Neither variation appears to be related directly to either the 
lunar cycle or to transparency losses caused by intermittent 
Sahara dust events, but the stellar coefficients ${}^{(1)}c_j$ show a
strong correlation with magnitude at V > 12. We infer that 
this systematic error component may be related to a combi- 
nation of sky brightness and atmospheric transparency that 
could affect the rejection threshold for faint stellar images 
in the sky aperture, leading the pipeline to underestimate 
Figure 4. The first four SysRem basis functions are plotted as a function of hour angle after being folded on a period of 1 sidereal day. To
give an idea of the magnitude correction applied to a typical star, each basis function is scaled by the RMS scatter of the corresponding 
stellar coefficients. 



the brightness of faint stars in bright moonlight and/or poor 
transparency. 

The second and fourth basis functions resemble half- 
sinusoids, 90 degrees out of phase, when plotted modulo 
sidereal time. Both the secondary extinction and the flat- 
field vignetting correction are expected to vary as functions 
of sidereal time. As discussed above, SuperWASP is not au- 
toguided, and a small misalignment of the polar axis causes 
stellar images to re-trace the same path across the CCD, a 
few tens of pixels long, every sidereal day. The second and 
fourth basis functions are probably a linear combination of 
these two effects. The corresponding stellar coefficients ${}^{(2)}c_j$
and ${}^{(4)}c_j$ should be correlated with departures from the stellar
colour used for the extinction modelling, and with the
gradient of the residuals in the vignetting function along 
the trajectory of each stellar image. The third basis func- 
tion is a linear trend through the night, generally increasing 
but occasionally decreasing. The origin of this component is 
not obvious, but one possibility is that it could arise from 
temperature-dependent changes in the camera focus through 
the night. 

The fifth and higher basis functions gave significantly 



smaller changes in $\chi^2$ than the first four, and their form
appears to represent mainly stochastic events affecting only 
a few points in the light-curves of a relatively small number 
of stars. To avoid the danger of removing genuine stellar 
variability, we modelled the global systematic errors using 
only the first four SysRem basis functions. 

In Fig. 3 we show the RMS-magnitude diagram and
covariance index b as functions of V magnitude after processing
with SysRem. We find that for stars fainter than
V = 11.0 the noise in the corrected light-curves is almost
uncorrelated. For brighter stars some residual evidence of 
correlated noise remains. On the 2.5-hour timescale of a typ- 
ical transit, however, the RMS amplitude of the correlated 
noise component is reduced to values of order 0.0015 mag- 
nitude with the help of SysRem. 



5 HYBRID TRANSIT-SEARCH ALGORITHMS 

We use an adaptation of the BLS algorithm (Kovacs et al.
2002) for the initial search. We set up a coarse search grid
of frequencies and transit epochs. The frequency step is
such that the accumulated phase difference between successive
frequencies over the full duration of the dataset corresponds
to the expected width of a transit at the longest
period searched. A set of transit epochs is defined at each
frequency, at phase intervals equal to the expected transit
width at that frequency. The expected transit duration is
computed from the orbital frequency using Kepler's third
law assuming a stellar mass of 0.9 $M_\odot$. Since the majority
of the main-sequence stars in the magnitude range of interest
have masses between 0.7 and 1.3 $M_\odot$ and the transit duration
at a given period scales as $(M_*/M_\odot)^{2/3}$, the predicted
transit duration is unlikely to be in error by more than 20
to 30 percent even at the extremes of the mass range.
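
The expected duration and the coarse frequency step can be sketched as below. Several assumptions are made for illustration and are not taken from the text: a circular orbit with a central transit, a planet much smaller than the star, a stellar radius that scales linearly with mass ($R_*/R_\odot = M_*/M_\odot$, which reproduces the $M_*^{2/3}$ scaling quoted above), and the function names themselves.

```python
import math

GM_SUN = 1.32712440018e20   # solar gravitational parameter, m^3 s^-2
R_SUN = 6.957e8             # solar radius, m
DAY = 86400.0               # seconds per day

def expected_transit_duration(period_days, mass=0.9):
    """Central-transit duration in days for a circular orbit.

    Uses Kepler's third law for the orbital separation and the
    small-angle chord crossing time P * R / (pi * a).
    """
    period = period_days * DAY
    a = (GM_SUN * mass * period ** 2 / (4.0 * math.pi ** 2)) ** (1.0 / 3.0)
    return (period / math.pi) * (mass * R_SUN / a) / DAY

def frequency_step(longest_period_days, span_days, mass=0.9):
    """Step in cycles/day giving a phase drift of one transit width over the run."""
    width = expected_transit_duration(longest_period_days, mass)
    return width / (longest_period_days * span_days)

duration_hours = 24.0 * expected_transit_duration(3.0)
mass_ratio = expected_transit_duration(3.0, 1.3) / expected_transit_duration(3.0, 0.9)
```

Under these assumptions a 3-day orbit around a 0.9 $M_\odot$ star gives a duration between 2 and 3 hours, in line with the 2.5-hour smoothing length used in Section 4.2, and a 1.3 $M_\odot$ star lengthens the duration by a factor $(1.3/0.9)^{2/3}$, just under 30 percent.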

At each trial period and epoch, the transit depth and
goodness-of-fit statistic $\chi^2$ are calculated using a variant of
the optimal fitting methods of Kovacs et al. (2002), reformulated
such that the goodness-of-fit criterion has the dimensions of
the $\chi^2$ statistic. A similar approach has also been used by
Aigrain & Irwin (2004) and by Burke et al. (2005).

After processing with SysRem, the light-curve of a
given star comprises a set of observations $\tilde{x}_i$ with associated
formal variance estimates $\sigma_i^2$ and additional, independent
variances $\sigma^2_{t(i)}$ to account for transient spatial irregularities
in atmospheric extinction. We define inverse-variance
weights

$$w_i = \frac{1}{\sigma_i^2 + \sigma^2_{t(i)} + \sigma^2_{s(j)}},$$

and subtract the optimal average value

$$\hat{x} = \frac{\sum_i \tilde{x}_i w_i}{\sum_i w_i}$$

to obtain $x_i = \tilde{x}_i - \hat{x}$. We also define

$$t = \sum_i w_i, \qquad \chi_0^2 = \sum_i x_i^2 w_i,$$

summing over the full dataset. Note that the weights defined
here include the independent variance component $\sigma^2_{s(j)}$. This
has the effect of lowering the significance of high-amplitude
variable stars, but has little effect on low-amplitude variables
such as planetary transit candidates.

In the BLS method, the transit model is characterised
by a periodic box function whose period, phase and duration
determine the subset $\ell$ of "low" points observed while
transits are in progress. In many implementations the partitioning
of the data can be the slowest part of the BLS procedure
(Aigrain & Irwin 2004). We achieve substantial speed
gains by computing the orbital phase $\phi$ of each data point
and sorting the phases in ascending order together with
their original sequence numbers. We partition the phase-ordered
data into a contiguous block of out-of-transit points
for which $w/2P < \phi < 1 - w/2P$, where w is the transit
duration and P is the orbital period, and the complement
of this subset comprising the in-transit points. The summations
that follow use the phase-ordered array of sequence
numbers to access the in-transit points.
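
The partitioning step can be sketched as follows. The function name is hypothetical, and the use of a binary search to locate the block boundaries is an illustrative choice rather than a detail stated in the text.

```python
import bisect
import math

def in_transit_indices(times, period, epoch, width):
    """Sequence numbers of points with phi <= w/2P or phi >= 1 - w/2P.

    Phases are sorted together with their original sequence numbers, so
    the out-of-transit points form one contiguous block of the sorted array.
    """
    phases = []
    for k, t in enumerate(times):
        x = (t - epoch) / period
        phases.append((x - math.floor(x), k))
    phases.sort()
    sorted_phi = [p for p, _ in phases]
    half = 0.5 * width / period
    lo = bisect.bisect_right(sorted_phi, half)        # first phi > w/2P
    hi = bisect.bisect_left(sorted_phi, 1.0 - half)   # first phi >= 1 - w/2P
    # phases[lo:hi] is the out-of-transit block; the rest is in transit.
    return [k for _, k in phases[:lo]] + [k for _, k in phases[hi:]]

indices = in_transit_indices([0.0, 0.3, 0.5, 0.95, 1.02], 1.0, 0.0, 0.2)
```

Because the in-transit points are usually a small fraction of the data, summing over only these sequence numbers is much cheaper than testing every point at every trial epoch.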

Using notation similar to that of Kovacs et al. (2002),
we define

$$s = \sum_{i\in\ell} x_i w_i, \qquad r = \sum_{i\in\ell} w_i, \qquad q = \sum_{i\in\ell} x_i^2 w_i.$$

The mean light levels inside (L) and outside (H) transit are

$$L = \frac{s}{r}, \qquad H = -\frac{s}{t - r},$$

with associated variances

$${\rm Var}(L) = \frac{1}{r}, \qquad {\rm Var}(H) = \frac{1}{t - r}.$$

The fitted transit depth and its associated variance are

$$\delta = L - H = \frac{st}{r(t - r)}, \qquad {\rm Var}(\delta) = \frac{t}{r(t - r)},$$

so the signal-to-noise ratio of the transit depth is

$$S/N = s\sqrt{\frac{t}{r(t - r)}}.$$

The improved fit to the data is given by $\chi^2 = \chi_0^2 - \Delta\chi^2$,
where the improvement in the fit when compared with that
of a constant light curve is

$$\Delta\chi^2 = \frac{s^2 t}{r(t - r)}.$$

Note also that $\Delta\chi^2 = (S/N)^2$. The goodness of fit
to the portions of the light-curve outside transit, where the
light level should be constant, is

$$\chi_h^2 = \chi_0^2 - \frac{s^2}{t - r} - q.$$

The best-fitting model at each frequency is selected, and the
corresponding transit depth, $\Delta\chi^2$ and $\chi_h^2$ are stored for each
star.
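
The quantities above can be evaluated for a single trial box as in the sketch below. The function name and the dictionary of results are illustrative choices; the sketch assumes that the in-transit subset is neither empty nor the whole dataset, so that $0 < r < t$.

```python
import math

def box_fit(values, weights, in_transit):
    """Box-model statistics for one trial period, epoch and duration.

    values are the light-curve points, weights their inverse variances and
    in_transit the sequence numbers of the points inside the trial box.
    """
    t = sum(weights)
    mean = sum(v * w for v, w in zip(values, weights)) / t
    x = [v - mean for v in values]          # weighted mean subtracted
    chi2_0 = sum(xi * xi * w for xi, w in zip(x, weights))
    s = sum(x[k] * weights[k] for k in in_transit)
    r = sum(weights[k] for k in in_transit)
    q = sum(x[k] ** 2 * weights[k] for k in in_transit)
    dchi2 = s * s * t / (r * (t - r))
    return {
        "depth": s * t / (r * (t - r)),
        "snr": s * math.sqrt(t / (r * (t - r))),
        "dchi2": dchi2,
        "chi2": chi2_0 - dchi2,
        "chi2_h": chi2_0 - s * s / (t - r) - q,
    }

# Noise-free toy light curve: two in-transit points offset by -0.02.
values = [0.0, 0.0, 0.0, -0.02, -0.02, 0.0, 0.0, 0.0, 0.0, 0.0]
fit = box_fit(values, [1e4] * 10, in_transit=[3, 4])
```

For this toy case the fitted depth recovers the injected offset of -0.02, and both the post-fit $\chi^2$ and the out-of-transit $\chi_h^2$ vanish, because the box model describes the data exactly.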



5.1 Selection of potential candidates 

Following an initial, coarse application of the BLS algorithm, 
we filter the candidates by rejecting obviously variable stars 
for which the post-fit $\chi^2 > 3.5N$, where N is the number
of observations. We reject stars for which the best solution 
has fewer than two transits. We also reject those best-fit so- 
lutions for which the phase-folded light curve contains gaps 
greater than 2.5 times the expected transit duration. Such 
solutions arise in a small number of stars that suffer from 
errors in the vignetting correction near the extremities of 
the field of view, or from transient dust motes in the op- 
tics that are not completely flat-fielded out. Stellar images 
drift across the image by a few dozen pixels during a typi- 
cal night, because SuperWASP is unguided and has a small 
misalignment of the polar axis. The drift pattern in the light 
curve recurs on a period of one sidereal day and its form de- 
pends strongly on location. It is significant mainly around 
the edges of the chip and near transient dust-ring features 
in the flat field. The SysRem algorithm is not effective at 
removing this type of variation from the small number of 
stars affected, so we reject them at this stage in the analy- 
sis. There is sufficient overlap between cameras that there is 
a high probability of the same star being recorded in a less 
problematic part of an adjacent camera's field of view. 

We select candidates for finer analysis by making cuts 
on two light-curve statistics, in both of which we expect stars 
showing periodic transit signals to lie well out in one tail of 
the distribution. The first is the "signal-to-red noise" ratio
$S_{\rm red}$ of the best-fit transit depth to the RMS scatter
binned on the expected transit duration:

$$ S_{\rm red} = \frac{\delta \sqrt{N_t}}{\sigma L^{b}} . $$

As in Sect. 4.2 above, $L$ is the average number of data points
spanning a single transit, $b$ is the power-law index that
quantifies the covariance structure of the correlated noise,
$N_t$ is the number of transits observed, $\delta$ is the
transit depth and $\sigma$ is the weighted RMS scatter of the
unbinned data. Since the transit depth $\delta$ is a signed
quantity, this statistic distinguishes periodic dimmings from
periodic brightenings. The second statistic is the
"anti-transit ratio" $\Delta\chi^2/\Delta\chi^2_-$ proposed by
Burke et al. (2005), being the ratio of the strongest peak in
the periodogram of $\Delta\chi^2$ that corresponds to a
dimming, to the strongest peak corresponding to a brightening.
We adopt conservative thresholds, requiring $S_{\rm red} < -5$
and $\Delta\chi^2/\Delta\chi^2_- > 1.5$. Note that Burke et al.
use a threshold of 2.75 for final selection on the latter
statistic.
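
The two preselection cuts can be illustrated with a minimal sketch. It assumes that the signal-to-red-noise ratio has the form $S_{\rm red} = \delta\sqrt{N_t}/(\sigma L^{b})$, so that $b = -1/2$ recovers the white-noise case, and that dimmings carry negative depth. The function and argument names are illustrative and are not taken from the SuperWASP pipeline.

```python
import math


def signal_to_red_noise(depth, sigma, points_per_transit, n_transits, b):
    """Assumed form: S_red = depth * sqrt(N_t) / (sigma * L**b)."""
    return depth * math.sqrt(n_transits) / (sigma * points_per_transit ** b)


def passes_preselection(s_red, dchi2_dimming, dchi2_brightening,
                        s_red_cut=-5.0, ratio_cut=1.5):
    """Apply the thresholds quoted in the text: S_red < -5 and ratio > 1.5."""
    anti_transit_ratio = dchi2_dimming / dchi2_brightening
    return s_red < s_red_cut and anti_transit_ratio > ratio_cut


# Hypothetical star: 0.02 mag dimming, 0.01 mag scatter, 20 points per
# transit, 5 transits observed.
s_white = signal_to_red_noise(-0.02, 0.01, 20, 5, b=-0.5)
s_red = signal_to_red_noise(-0.02, 0.01, 20, 5, b=-0.25)
```

With correlated noise ($b$ closer to zero) the binned scatter falls more slowly than $L^{-1/2}$, so the same transit yields a less significant $S_{\rm red}$ than in the white-noise case.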

Even this relatively loose set of selection criteria elimi- 
nates 95.5 to 97.5 percent of all the stars in the sample, typ- 
ically leaving 100 to 200 surviving objects worthy of more 
detailed study from an initial sample of several thousand 
stars. 

In order to ensure that the most obvious transit candi- 
dates are not rejected by the filtering, we injected synthetic 
patterns of transits, with randomly-generated periods and 
epochs, into the light-curves of 100 randomly-chosen stars in 
the test dataset. The transits were given depths of 0.02 mag- 
nitude, and their durations were again computed from the 
orbital frequency using Kepler's third law assuming a stellar 
mass of 0.9 $M_\odot$. The synthetic transit signatures were added
to the actual data, thus preserving the noise properties of 
the observations. 
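
The injection test can be sketched as follows. This is an illustration only: it assumes a central transit of a small planet, for which the duration is roughly $(P/\pi)(R_*/a)$ with $a$ given by Kepler's third law, and it assumes $R_* \simeq M_*^{0.8}$ in solar units to obtain the stellar radius. The light curve is taken to be in magnitudes, so a dimming is added as a positive offset; the sign should be flipped if the data are stored in the opposite sense.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30   # kg
R_SUN = 6.957e8    # m
DAY = 86400.0      # s


def transit_duration_days(period_days, mass_msun=0.9):
    """Approximate central-transit duration from Kepler's third law."""
    period = period_days * DAY
    a = (G * mass_msun * M_SUN * period ** 2 / (4.0 * math.pi ** 2)) ** (1.0 / 3.0)
    r_star = R_SUN * mass_msun ** 0.8   # assumed mass-radius relation
    return (period / math.pi) * (r_star / a) / DAY


def inject_transits(times, mags, period, epoch, depth=0.02, mass_msun=0.9):
    """Add box-shaped transits to real data, preserving its noise."""
    width = transit_duration_days(period, mass_msun)
    injected = []
    for t, m in zip(times, mags):
        # Orbital phase folded into [-0.5, 0.5), zero at mid-transit.
        phase = ((t - epoch) / period + 0.5) % 1.0 - 0.5
        in_transit = abs(phase) * period < 0.5 * width
        injected.append(m + depth if in_transit else m)
    return injected
```

For a 3-day period and a 0.9 $M_\odot$ star this approximation gives a duration of about 2.5 hours.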

In Fig. 5 we plot the anti-transit ratio against $S_{\rm red}$ for
all stars in the dataset with positive values of $S_{\rm red}$. The 100
stars for which synthetic transits were injected are denoted 
by crosses, and the remainder by dots. Those stars selected 
for further study are circled, confirming that the preselec- 
tion procedure captures a set of objects that includes nearly 
all candidates with significant transit signals. Those that 
were not selected were either too faint and noisy to yield a 
significant detection, or were superimposed on light-curves 
of intrinsically variable stars that failed to satisfy the other 
selection criteria. 



5.2 Refinement of transit parameters 

The reduced sample is again subjected to a BLS search, this 
time utilising a finer grid spacing in which the frequency step 
is chosen to give a phase drift over the entire data train that 
is no more than half the expected transit width. The grid 
spacing in epoch is set at half the transit width. For each 
star in the sample we identify the five most significant peaks 
that correspond to transit-like dimmings, and the three most 
significant peaks that correspond to brightenings. (Selecting
the three most significant peaks at the resolution of the
grid search suffices to capture the strongest peak reliably
after Newton-Raphson refinement, but for dimmings we are
interested in possible aliases, so we examine the five most
significant peaks.)
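
One way to read the grid-spacing rule is sketched below. A frequency offset $\Delta f$ shifts the predicted transit time by $T\,\Delta f/f$ over a data train of length $T$, so requiring this drift to be at most half the transit width $w$ gives $\Delta f = w f/(2T)$. This interpretation, the helper names and the example numbers are assumptions made for illustration.

```python
def fine_grid_steps(freq, span_days, width_days):
    """Return (frequency step, epoch step) for the finer BLS grid."""
    freq_step = 0.5 * width_days * freq / span_days
    epoch_step = 0.5 * width_days
    return freq_step, epoch_step


def strongest_peaks(peaks, n):
    """peaks: list of (delta_chi2, frequency); keep the n largest."""
    return sorted(peaks, key=lambda pk: pk[0], reverse=True)[:n]


# Hypothetical case: 2-day period, 150-day data train, 0.1-day transit.
df, dt0 = fine_grid_steps(freq=0.5, span_days=150.0, width_days=0.1)

# Hypothetical periodogram peaks for dimmings; keep the best five.
dimmings = [(120.0, 0.49), (80.0, 0.98), (60.0, 0.245),
            (55.0, 0.33), (40.0, 0.7), (10.0, 0.1)]
top_five = strongest_peaks(dimmings, 5)
```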

Figure 5. A scatterplot of the anti-transit ratio against the
signal to red noise ratio for stars in the 0143+3126 field
shows that the majority of non-variable stars yield
anti-transit ratios less than 2.0, and spurious best-fit signal
to red noise ratios between -3.0 and -6.0. Crosses denote the
100 stars for which synthetic transits with depths of 0.02
magnitude were injected. Stars that satisfy the initial
selection criteria for refined analysis are circled.
(Horizontal axis: transit depth / binned RMS.)

Having identified a subset of objects in which
statistically-significant transit-like signals may be present,
we next refine the transit parameters. Instead of using a pure
box function we adopt a softened box-like function $\mu(t_i)$
developed by Protopapas et al. (2005):

$$ \mu(t_i) = \frac{1}{2}\,\delta \left\{ \tanh\left[c\left(t_p + \frac{1}{2}\right)\right] + \tanh\left[c\left(-t_p + \frac{1}{2}\right)\right] \right\} $$

to approximate the light-curve at time $t_i$, where

$$ t_p = \frac{P \sin[\pi (t_i - T_0)/P]}{\pi \eta} . $$

Here $T_0$ is the epoch of mid-transit, $P$ is the orbital
period, $\delta$ is the transit depth, $\eta$ is the transit
duration and $c$ is a softening parameter that determines the
duration of ingress and egress in the model transit.

This function has the advantage that it is analytically
differentiable with respect to the key transit parameters
$T_0$, $P$, $\eta$ and $\delta$, which means that we can refine
them quickly and efficiently using a Newton-Raphson approach
based on the derivative functions $\partial\mu/\partial T_0$,
$\partial\mu/\partial P$ and $\partial\mu/\partial\eta$.
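
A minimal sketch of the softened box is given below. It assumes that the second hyperbolic tangent has argument $c(\frac{1}{2} - t_p)$, which is the form that yields a box of depth $\delta$ and full width $\eta$ repeating every period $P$. The function name and the example parameter values are illustrative.

```python
import math


def softened_box(t, t0, period, depth, width, c):
    """Softened box-like transit model mu(t), periodic in `period`."""
    t_p = period * math.sin(math.pi * (t - t0) / period) / (math.pi * width)
    return 0.5 * depth * (math.tanh(c * (t_p + 0.5))
                          + math.tanh(c * (0.5 - t_p)))


# Hypothetical transit: period 2 d, depth -0.02, width 0.1 d, c = 50.
mid_transit = softened_box(0.0, 0.0, 2.0, -0.02, 0.1, 50.0)
next_transit = softened_box(2.0, 0.0, 2.0, -0.02, 0.1, 50.0)
out_of_transit = softened_box(1.0, 0.0, 2.0, -0.02, 0.1, 50.0)
ingress_edge = softened_box(0.05, 0.0, 2.0, -0.02, 0.1, 50.0)
```

Large values of $c$ approach a pure box, while small values stretch ingress and egress. At $|t - T_0| \approx \eta/2$ the model sits at about half the full depth.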

For an estimated set of parameters $T_0$, $P$, $\eta$ we fit
the transit depth by defining a basis function

$$ p_i = \partial\mu(t_i)/\partial\delta = \mu(t_i)/\delta $$

and, for each observation $x_i$, determine the optimal scaling
factor to fit the light-curve:

$$ \hat{\delta} = \frac{\sum_i (x_i - \hat{x})(p_i - \hat{p})\, w_i}{\sum_i (p_i - \hat{p})^2\, w_i} , \tag{7} $$

where $\hat{x} = \sum_i x_i w_i / \sum_i w_i$ and
$\hat{p} = \sum_i p_i w_i / \sum_i w_i$.
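
The depth fit of Eq. (7) is an ordinary weighted least-squares scaling of a basis function, and can be sketched as below. The weights are assumed to be inverse variances that combine the formal error of each point with a per-frame patchy-extinction term; the helper names and the toy data are illustrative.

```python
def inverse_variance_weights(sigma, sigma_t):
    """w_i = 1 / (sigma_i^2 + sigma_t(i)^2), an assumed weighting."""
    return [1.0 / (s * s + st * st) for s, st in zip(sigma, sigma_t)]


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def optimal_scale(x, basis, weights):
    """Weighted least-squares scale factor of `basis` fitted to `x`."""
    xbar = weighted_mean(x, weights)
    bbar = weighted_mean(basis, weights)
    num = sum((xi - xbar) * (bi - bbar) * wi
              for xi, bi, wi in zip(x, basis, weights))
    den = sum((bi - bbar) ** 2 * wi for bi, wi in zip(basis, weights))
    return num / den


# Toy light curve: constant level 3.0 plus a -0.02 box where p_i = 1.
p = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0]
x = [3.0 - 0.02 * pi for pi in p]
w = inverse_variance_weights([0.01] * 7, [0.0, 0.0, 0.02, 0.0, 0.0, 0.01, 0.0])
depth = optimal_scale(x, p, w)
```

Because both the data and the basis are referred to their weighted means, any constant offset in the light curve drops out of the fit.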

At this stage in the analysis we omit the variance component
$\sigma^2_{s(j)}$ due to stellar variability, but retain the
patchy-extinction contribution $\sigma^2_{t(i)}$ so that poor
quality data are down-weighted correctly:

$$ w_i = \frac{1}{\sigma_i^2 + \sigma^2_{t(i)}} . $$



We refine the model parameters $T_0$, $P$ and $\eta$ in turn.
To refine the epoch of transit, for instance, we subtract the
current model from the data to obtain

$$ y_i = x_i - \mu(t_i) , $$

and define the basis function

$$ q_i = \partial\mu(t_i)/\partial T_0 . $$

We then determine the optimal scaling factor

$$ dT_0 = \frac{\sum_i (y_i - \hat{y})(q_i - \hat{q})\, w_i}{\sum_i (q_i - \hat{q})^2\, w_i} , \tag{8} $$

where $\hat{y} = \sum_i y_i w_i / \sum_i w_i$ and
$\hat{q} = \sum_i q_i w_i / \sum_i w_i$.

The new estimate of the transit epoch is $T_0 + dT_0$, so
we re-compute $\mu$ and redetermine the transit depth using
Eq. (7). The period $P$ and transit width $\eta$ are refined in
turn using procedures exactly analogous to Eq. (8). The
parameter values converge rapidly to the optimal solution
after a few iterations.
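
The epoch refinement can be illustrated with the following sketch, which alternates the depth fit of Eq. (7) with the epoch correction of Eq. (8) while holding $P$ and $\eta$ fixed. It assumes the softened-box form with a second term $\tanh[c(\frac{1}{2} - t_p)]$, uses the chain rule for $\partial\mu/\partial T_0$, and is tested only on noise-free synthetic data. All names and numerical values are hypothetical.

```python
import math


def t_prime(t, t0, period, width):
    return period * math.sin(math.pi * (t - t0) / period) / (math.pi * width)


def mu(t, t0, period, depth, width, c):
    tp = t_prime(t, t0, period, width)
    return 0.5 * depth * (math.tanh(c * (tp + 0.5)) + math.tanh(c * (0.5 - tp)))


def dmu_dt0(t, t0, period, depth, width, c):
    """Analytic derivative of mu with respect to the epoch T0."""
    tp = t_prime(t, t0, period, width)
    a = math.tanh(c * (tp + 0.5))
    b = math.tanh(c * (0.5 - tp))
    # sech^2(z) = 1 - tanh^2(z), which avoids overflow in cosh.
    dmu_dtp = 0.5 * depth * c * ((1.0 - a * a) - (1.0 - b * b))
    dtp_dt0 = -math.cos(math.pi * (t - t0) / period) / width
    return dmu_dtp * dtp_dt0


def scale(y, basis, w):
    """Weighted least-squares scale factor, as in Eqs. (7) and (8)."""
    wsum = sum(w)
    ybar = sum(yi * wi for yi, wi in zip(y, w)) / wsum
    bbar = sum(bi * wi for bi, wi in zip(basis, w)) / wsum
    num = sum((yi - ybar) * (bi - bbar) * wi for yi, bi, wi in zip(y, basis, w))
    den = sum((bi - bbar) ** 2 * wi for bi, wi in zip(basis, w))
    return num / den


def refine_epoch(t, x, w, t0, period, width, c, n_iter=15):
    """Alternate the depth fit and the epoch correction dT0."""
    for _ in range(n_iter):
        p = [mu(ti, t0, period, 1.0, width, c) for ti in t]
        depth = scale(x, p, w)
        y = [xi - depth * pi for xi, pi in zip(x, p)]
        q = [dmu_dt0(ti, t0, period, depth, width, c) for ti in t]
        t0 += scale(y, q, w)
    p = [mu(ti, t0, period, 1.0, width, c) for ti in t]
    return t0, scale(x, p, w)


# Noise-free synthetic light curve with true epoch 0.53 and depth -0.02.
times = [0.005 * k for k in range(2000)]
data = [10.0 + mu(ti, 0.53, 2.0137, -0.02, 0.12, 5.0) for ti in times]
t0_fit, depth_fit = refine_epoch(times, data, [1.0] * len(times),
                                 0.51, 2.0137, 0.12, 5.0)
```

The same pattern applies to $P$ and $\eta$ once their derivative functions are substituted for `dmu_dt0`. The correction is only reliable when the starting epoch lies within roughly one ingress time of the true value, which is why the preceding grid search matters.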

At this stage the phased light-curves of the best-fit so- 
lutions are inspected visually to pick out those candidates 
showing clear transit-like signatures. Candidates identified 
in this way from the 0143+3126 field are listed in Table 1
and a representative selection of their phased light-curves
is shown in Fig. 6. Some of these are clearly eclipsing
binaries, exhibiting secondary eclipses, out-of-transit variability,
or both. Further physical characterisation is needed to dis- 
tinguish plausible planetary transit candidates from proba- 
ble stellar impostors. 



6 CHARACTERIZATION OF CANDIDATES 



Both theory (Brown 2003) and the experience of previous
transit follow-up campaigns (Alonso et al. 2004; Bouchy et al.
2005; Pont et al. 2005; O'Donovan et al. 2006)

indicate that among our transit candidates, stellar binaries 
will outnumber genuine planetary transits by an order of 
magnitude. Some are grazing eclipsing binaries; others are 
multiple systems in which the light of a stellar eclipsing pair 
is diluted; others still have low-mass stellar or brown-dwarf 
companions whose radii are similar to those of gas-giant 
planets. In order to mitigate the false-alarm rate our candi- 
dates must pass a number of tests before being considered 
as high-priority spectroscopic targets. 



6.1 Ellipsoidal variations 

Drake (2003) and Sirko & Paczyński (2003) pointed out that
ellipsoidal variables can be rejected with a high degree of 
certainty as stellar binaries. Detached stellar binaries with 
orbital periods of 1 to 5 days can easily be mistaken for 
transiting exoplanet systems if they exhibit either grazing 
eclipses or deeper eclipses diluted by the light of a blended 
third star. In either case, the equipotential surfaces of the 
two stars will often be sufficiently ellipsoidal to yield de- 
tectable out-of-transit variability. 

Since the data have already been partitioned into subsets
of points inside and outside transit, we approximate the
ellipsoidal variation with phase angle $\theta_i$ by fitting

$$ p_i = \cos 2\theta_i $$

to the out-of-transit residuals $x_i - \mu(t_i)$. We sum over
the subset $h$ of points outside transit to define

$$ u = \sum_h (x_i - \mu(t_i))\, p_i w_i , \qquad v = \sum_h p_i^2 w_i $$

and so obtain both the amplitude and formal variance

$$ \epsilon = \frac{u}{v} , \qquad {\rm Var}(\epsilon) = \frac{1}{v} $$

of the ellipsoidal variation. The signal-to-noise ratio of the
amplitude is

$$ {\rm S/N} = \frac{\epsilon}{\sqrt{{\rm Var}(\epsilon)}} = \frac{u}{\sqrt{v}} . $$
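
The ellipsoidal-variation test can be sketched as below. The sketch assumes a basis $p_i = \cos 2\theta_i$ with $\theta_i = 2\pi\phi_i$, where $\phi_i$ is the orbital phase measured from mid-transit, and inverse-variance weights. The sign convention that marks a system as brighter at quadrature depends on whether the light curve is stored in magnitudes or flux, so only the amplitude and its significance are computed here. All names and values are illustrative.

```python
import math


def ellipsoidal_fit(phases, residuals, weights, half_width_phase):
    """Return (amplitude, variance, S/N) of a cos(2*theta) term fitted
    to the out-of-transit residuals."""
    u = 0.0
    v = 0.0
    for phi, r, w in zip(phases, residuals, weights):
        folded = abs((phi + 0.5) % 1.0 - 0.5)
        if folded <= half_width_phase:
            continue                      # skip in-transit points
        p = math.cos(4.0 * math.pi * phi)  # cos(2*theta), theta = 2*pi*phi
        u += r * p * w
        v += p * p * w
    amplitude = u / v
    variance = 1.0 / v
    return amplitude, variance, amplitude / math.sqrt(variance)


# Hypothetical residuals: a 0.002 mag cos(2*theta) signal, 0.002 mag errors.
phases = [k / 200.0 for k in range(200)]
residuals = [0.002 * math.cos(4.0 * math.pi * phi) for phi in phases]
weights = [1.0 / 0.002 ** 2] * len(phases)
eps, var_eps, snr = ellipsoidal_fit(phases, residuals, weights, 0.02)
```

In this example the recovered amplitude equals the injected 0.002 mag and the significance exceeds the S/N of about 5 used as the rejection threshold.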



Any candidate for which the sign of $\epsilon$ indicates that
the system is significantly brighter at quadrature than at
conjunction, with S/N > 5 or so, is noted as a probable stellar
impostor. In Table 1 we see that two of the possible
transit-like candidates identified by eye are disqualified in
this way.

The 5$\sigma$ detection threshold for ellipsoidal variation is
nearly independent of orbital period, but ranges from
0.003 magnitude at V = 12.5 to 0.001 mag at V = 10 for the
dataset studied here. The expected amplitude of ellipsoidal 
variation depends on the ratio of the primary radius to the 
orbital separation, and on the mass ratio of the system. 

The SuperWASP transit search yields many objects in 
which a solar-type star appears in a 1 to 2-day orbit about a 
companion with a radius of 1 to 2 $R_{\rm Jup}$. Simple
projected-area calculations based on a standard Roche
equipotential surface model for a main-sequence star of
1 $M_\odot$ with a 0.2 $M_\odot$ companion in a 1.5-day orbit
yield an ellipsoidal variation of order 0.002 magnitude, which
should be detectable
with high significance in an isolated system with V < 11 
or so. Companions less massive than this may not yield de- 
tectable ellipsoidal variations, but are sufficiently interesting 
in their own right to be worth following up. 

Ellipsoidal variability can also help to eliminate impos- 
tor systems where a bright foreground star is blended with a 
background eclipsing binary. If one component of an eclips- 
ing stellar binary is evolved, such as in an RS CVn system 
with a 2.5 to 3.0 $R_\odot$ K subgiant and a solar-type
main-sequence star in a 3-day orbit, the ellipsoidal variation can
be as great as 0.02 to 0.05 magnitude. If such a system ex- 
hibits a partial primary eclipse 0.1 to 0.3 magnitude deep 
and a shallow secondary eclipse, and is blended with a fore- 
ground star 2 to 3 magnitudes brighter, it can mimic an 
exoplanetary transit. The ellipsoidal variation is thus great 
enough to remain detectable even when diluted by a blend 
2 or 3 magnitudes brighter. 



6.2 Stellar mass and planet radius 

Seager & Mallén-Ornelas (2003) and Tingley & Sackett
(2005) have developed methods for deriving the physical
parameters of transit candidates based on light-curve
parameters alone. Seager & Mallén-Ornelas use the orbital period,
the total transit duration, the duration of ingress and egress 
and the transit depth to derive the impact parameter, the 
orbital inclination, the stellar mass, the planet radius and 
the orbital separation. Because the durations of ingress and
egress are difficult to measure reliably in noisy data, we
define a simplified consistency test predicated on the
assumption that the orbital inclination is close to 90 degrees.
Our aim is simply to estimate the mass of the star and the
radius of the planet, and thereby to determine whether the
transits could plausibly be caused by a roughly Jupiter-sized
planet orbiting a star of roughly solar mass and radius.

Figure 6. Representative light curves and periodograms of four
objects in the 0143+3126 field exhibiting transit-like
behaviour. The fourth object, 1SWASP J014228.76+335433.9, shows
clear ellipsoidal variability outside transit, indicating a
stellar binary. (Panels: light curves against phase and
periodograms against frequency. Surviving panel titles:
Star: 1SWASP J014400.22+344449.2, Width(h) = 3.168,
Depth = -0.0273, Ntrans = 5; Star 545, f = 0.489571,
$\Delta\chi^2$ = 1252.611; Star: 1SWASP J015711.29+303447.7,
Width(h) = 2.304, Depth = -0.0155, Ntrans = 9.)

Figure 6 (continued). Surviving panel title:
Star: 1SWASP J015625.53+291432.5, Width(h) = 3.216,
Depth = -0.0124, Ntrans = 13.

Once an object is found to display transit-like events
(characterised by a flat light-curve outside eclipse, no
secondary eclipse and a transit depth < 0.1 mag) we search the
USNO-B1.0 catalogue for blends less than 5 mag fainter,
located within the 48-arcsec radius of the 3.5-pixel
photometric aperture. We estimate a main-sequence radius and
mass from a $V - K$ colour index derived from the SuperWASP
$V$ magnitude and the 2MASS $K$ magnitude. We infer both the
expected transit duration and the radius of the putative
planet using the simplified mass-radius relation of
Tingley & Sackett (2005),

$$ R_* \simeq M_*^{4/5} . $$

The size of the planet follows from the approximate expression
of Tingley & Sackett (2005) for the transit depth $\delta$ for
a limb-darkened star:

$$ \frac{R_p}{R_*} \simeq \sqrt{\frac{\delta}{1.3}} . $$
Table 1. Transit candidates identified by eye from 0143+3126 field. Quantities that disqualify a candidate from further consideration
are shown in boldface. IDs of disqualified candidates are shown in parentheses.

SuperWASP ID                    S_red  delta   Duration (h)  Epoch      Period    N_t  (S/N)_ellip
(1SWASP J015721.21+333517.9)    27.4   0.0518  1.704         3182.1328  4.14907   2    2.1
1SWASP J013901.75+333640.6      21.2   0.1781  2.04          3180.9563  3.440225  5    3.6
1SWASP J015711.29+303447.7      12.3   0.0155  2.304         3182.5613  2.042603  9    1.3
(1SWASP J014228.76+335433.9)    13.4   0.0385  3.648         3180.7258  2.02348   12   16.5
1SWASP J014400.22+344449.2      11.9   0.0273  3.168         3180.2839  3.719753  5    4.4
1SWASP J015625.53+291432.5      12.2   0.0124  3.216         3182.5415  1.451347  13   4.9
1SWASP J014212.56+341534.4      11.3   0.0632  2.904         3182.0596  4.305729  6    4.7
1SWASP J014211.84+341606.5      11.1   0.0544  3.072         3182.0503  4.30604   6    4.2
(1SWASP J012536.11+341423.8)    8.9    0.0623  2.52          3181.6191  1.891481  i    9.5
1SWASP J014549.24+350541.9      6.9    0.0081  2.256         3182.407   1.452465  9    0.6
1SWASP J014700.48+280243.6      6.7    0.008   4.32          3182.1853  1.68013   13   0.5

High-priority candidates must display multiple transits,
a fitted transit duration within a factor of 1.5 of the
predicted value, a transit depth indicating a planetary radius
less than 1.6 Jupiter radii, and have no blends less than
3 magnitudes fainter located within the 48 arcsec photometric
aperture. Their proper motions (from the Hipparcos, TYCHO-2 or
USNO-B1.0 catalogues) and $V - K$ colours must also be
consistent with main-sequence stars rather than giants, the
luminosity class being inferred from the reduced proper-motion
method of Gould & Morgan (2003) and the giant-dwarf separation
method of Bilir et al. (2006).
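
The high-priority criteria amount to a simple filter, sketched below. The field names are hypothetical, "multiple transits" is read as at least two, and the dwarf or giant classification is assumed to have been made beforehand from proper motion and colour. The example values are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    n_transits: int
    duration_ratio: float        # observed / predicted transit duration
    planet_radius_rjup: float
    n_blends_within_3mag: int    # blends < 3 mag fainter inside the aperture
    is_dwarf: bool               # main-sequence star rather than giant


def is_high_priority(c):
    """Apply the selection criteria listed in the text."""
    return (c.n_transits >= 2
            and 1.0 / 1.5 <= c.duration_ratio <= 1.5
            and c.planet_radius_rjup < 1.6
            and c.n_blends_within_3mag == 0
            and c.is_dwarf)


# Hypothetical examples.
good = Candidate(9, 0.8, 1.4, 0, True)
too_large = Candidate(12, 1.1, 2.5, 0, True)
blended = Candidate(6, 1.0, 1.2, 1, True)
```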



In Table 2 we list the effective temperatures, spectral
types and stellar radii estimated from the $V - K$ colour
indices, together with the inferred planet radius and the
ratio $\eta$ of the observed to the expected transit duration.
The effective temperatures are derived using the calibration
of Blackwell & Lynas-Gray (1994), and the radii using
the interferometrically-determined colour-surface brightness
relations of Kervella et al. (2004) together with Hipparcos
parallaxes, where available. Two objects are rejected
immediately, because brighter stars are found within the radius
of the photometric aperture. Both stars found previously to
exhibit significant ellipsoidal variations yield inferred
companion radii, 2.46 and 1.63 $R_{\rm Jup}$, substantially
greater than expected for gas-giant planets. Four of the
remaining candidates are rejected on the same grounds.



Of the original eleven candidates, three remain.
1SWASP J015625.53+291432.5 and 1SWASP J014549.24+350541.9
are mid-K stars, for which the inferred companion radii are
substantially less than that of Jupiter. Both of these have
stars less than 5 magnitudes fainter located within the
photometric aperture, so further follow-up is warranted to
eliminate the possibility that the blended stars could be
eclipsing binaries. The transit detection in
1SWASP J014549.24+350541.9 is of rather marginal significance,
with $S_{\rm red} = 6.93$. The brightest of the three
candidates, 1SWASP J015711.29+303447.7, appears to be an F6
star with a 1.38 $R_{\rm Jup}$ companion.


7 DISCUSSION AND CONCLUSIONS 

In this paper we have described the methodology that we 
have adopted for searching for transits in the large body 
of data produced by the SuperWASP camera array on La 
Palma during its first few months of operation, and applied 
it to the 7840 stars brighter than V = 13.0 in a survey field 
centred at RA 01h 43m, Dec +31° 26'.

We find that an initial search using the BLS method 
on a coarse grid of transit epochs and orbital frequencies 
is sufficient for us to eliminate more than 95 percent of 
the stars searched. This allows us to perform a finer grid 
search on only the remaining 198 stars in the field under
consideration. We have adapted the analytic Newton-Raphson
method of Protopapas et al. (2005) to refine the
orbital solutions around periodogram peaks found with the
BLS method. This method quickly yields the depth and du- 
ration of the transits, and the frequency and phase of the 
photometric orbit, while keeping the dimensionality of the 
search grid (and hence the processing time required) as low 
as possible. 

We use the ellipsoidal-variation methodology of Drake
(2003) and Sirko & Paczyński (2003) in our light-curve
modelling to eliminate probable stellar binaries. We use
the publicly-available 2MASS and USNO-B1.0 catalogues
to obtain colours and proper motions for candidates, and
to estimate the stellar and planetary radii using methods
similar to those of Seager & Mallén-Ornelas (2003) and
Tingley & Sackett (2005). These methods confirm that two
of our most significant transit detections in this field, and 
one more marginal one, have transit properties fully consis- 
tent with those expected for planets with radii comparable 
to or somewhat smaller than Jupiter. 

The survey field chosen to illustrate the method is only 
one of more than 100 such regions surveyed during the course 
of 2004. Candidates from the other fields will be presented 
and discussed in subsequent papers. Together with the three 
candidates presented here, all likely planetary-transit can- 
didates will be subjected to high-resolution spectroscopic 
follow-up in the latter half of 2006, using the methodology
employed successfully by Bouchy et al. (2005) for OGLE-III
follow-up.

Table 2. Estimated physical parameters for transit candidates in 0143+3126 field. Parameter values inconsistent with planetary status
are highlighted in bold type. IDs of stars disqualified on these grounds are shown in parentheses.

SuperWASP ID                    Vt(SW)  V-K   T_eff (K)  Sp. type  R_* (R_sun)  R_p (R_Jup)  eta   N_brighter  N_<5mag fainter
(1SWASP J015721.21+333517.9)    10.982  1.36  6118       F8        1.18         2.29         0.44  0           0
(1SWASP J013901.75+333640.6)    12.986  1.86  5498       G8        0.89         3.2          0.58  0           2
1SWASP J015711.29+303447.7      10.352  1.19  6354       F6        1.3          1.38         0.77  0           0
(1SWASP J014228.76+335433.9)    10.963  0.93  6740       F2        1.47         2.46         1.08  0           1
(1SWASP J014400.22+344449.2)    11.164  1.3   6200       F8        1.22         1.72         0.88  0           1
1SWASP J015625.53+291432.5      10.294  2.3   5044       K3        0.76         0.72         1.67  0           1
(1SWASP J014212.56+341534.4)    12.247  1.37  6105       F9        1.17         2.51         0.74  0           1
(1SWASP J014211.84+341606.5)    12.359  n/a   n/a        n/a       n/a          n/a          n/a   2           2
(1SWASP J012536.11+341423.8)    12.336  2.26  5081       K2        0.77         1.64         1.07  0           1
1SWASP J014549.24+350541.9      11.44   2.45  4908       K4        0.73         0.56         1.22  0           2
(1SWASP J014700.48+280243.6)    11.764  n/a   n/a        n/a       n/a          n/a          n/a   1           1